THE EFFECT OF ITEM SEQUENCE ON BAR EXAMINATION SCORES

Stephen  P.  Klein,  The  Rand  Corporation 
Roger  Bolus,  GANSK  &  Associates 

Large scale testing programs can reduce the likelihood of one examinee
copying another's answers by having sufficient distance between seats,
having adequate proctoring, varying answer sheet format, and using multiple
test forms. The use of multiple forms usually involves having one form
contain one set of items and the other forms contain different sets of
items. In other words, at a given administration of the examination, not
all examinees answer the same questions. Although this strategy may be
sound in terms of psychometric standards, it may be inconsistent with the
policies of the organization sponsoring the testing program. For instance,
the National Conference of Bar Examiners requires that all examinees taking
the Multistate Bar Examination (MBE) on one of its biannual administrations
answer the same set of questions.

The MBE is a 200 item multiple choice test that is taken by about
55,000 applicants to the bar each year. In many states, there is often
substantially less than adequate seating distance among examinees. This
situation has led to several incidents of cheating. Cheating on a bar
examination is especially serious because it is a moral character violation
that may prohibit an examinee from practicing law even if he/she retakes
and passes the examination.

One solution to the foregoing problem is to use multiple test forms
that differ in terms of the order in which the items appear. This strategy
is consistent with the policy of having all examinees answer the same items
and it would substantially reduce the opportunity for cheating. There are
two major concerns with this approach: (1) some sequences may be easier
than others, thereby giving some examinees an unfair advantage, and (2) it
might change the characteristics of items used for equating tests across
administrations.

There are no data on whether essentially random variations in item
sequence would change the psychometric properties of a test or its items.
Almost all the literature on item order effects comes from studies with
high school or college students. These studies have investigated
systematic rather than random variations in item sequence (such as from
easy to hard versus hard to easy) and/or the effects of mixing versus
separating item types or content (such as quantitative and verbal items).
These studies are therefore not especially relevant to the MBE and many
other large post-secondary testing programs.

PURPOSE 

The present study was conducted to determine whether varying the
sequence in which blocks of items were presented to examinees would affect
test and/or item characteristics. There were two reasons for studying the
effects of varying blocks rather than individual items: (1) many tests,
including the MBE, have several items tied to a common passage and (2) it
would be less expensive to print and score multiple forms if variation was
limited to item blocks.

SAMPLE 

The sample for the study consisted of 2940 applicants to the bar in a
large western state. These applicants were encouraged to participate and
do well in the study because a high score would improve their chances of
passing the MBE and essay portions of their state's bar examination.


INSTRUMENTS 


The  study  used  60  items  that  were  drawn  from  4  content  areas.  These 
items  had  appeared  on  previous  but  still  secure  versions  of  the  MBE.  The 
items  were  divided  into  two  sets,  A  and  B.  Each  set  contained  30  items. 

Two versions of each set were constructed. Thus, there were a total
of four forms: A-1, A-2, B-1, and B-2. The first 10 items on form A-1
were the same as the last 10 on A-2, while the last 10 on A-1 were the same
as the first 10 on A-2. Forms B-1 and B-2 followed this same XYZ and ZYX
pattern. Table 1 shows how items were allocated to test forms.

Table 1

ASSIGNMENT OF ITEMS TO FORMS

Test Form    Sequence of items within form

A-1          1-10, 11-20, 21-30
A-2          21-30, 11-20, 1-10
B-1          31-40, 41-50, 51-60
B-2          51-60, 41-50, 31-40
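
The block reversal described for the four forms can be sketched in a few lines of Python. This is only an illustration of the XYZ and ZYX pattern; the function name build_form and the dictionary layout are assumptions made for the example and are not taken from the study.

```python
def build_form(blocks, reverse=False):
    """Return item numbers in presentation order.

    blocks  : list of item blocks (each an iterable of item numbers)
    reverse : if True, present the blocks in reverse order (ZYX),
              keeping the order of items inside each block unchanged.
    """
    ordered = list(reversed(blocks)) if reverse else list(blocks)
    return [item for block in ordered for item in block]


# Three blocks of 10 items per set, as described in the text.
set_a_blocks = [range(1, 11), range(11, 21), range(21, 31)]
set_b_blocks = [range(31, 41), range(41, 51), range(51, 61)]

forms = {
    "A-1": build_form(set_a_blocks),
    "A-2": build_form(set_a_blocks, reverse=True),
    "B-1": build_form(set_b_blocks),
    "B-2": build_form(set_b_blocks, reverse=True),
}
```

Because only the block order changes, items tied to a common passage stay adjacent on both versions of a set.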

PROCEDURES 

Applicants were assigned randomly to 4 groups. Groups 1 and 3 took an
A form under a 55 minute time limit (which corresponds to the time normally
allowed per item) and then a B form under almost total power conditions (a
90 minute time limit). Groups 2 and 4 took a B form under a 55 minute time
limit and then an A form under a 90 minute time limit. This design, which
appears in Table 2, provides two independent tests of sequence effects under
the 55 minute time limit (Groups 1 vs 3 on set A and 2 vs 4 on set B) and
two independent tests under the 90 minute time limit (Groups 1 vs 3 on set B
and 2 vs 4 on set A).


Table 2

ASSIGNMENT OF FORMS TO GROUPS

                      Group
Time limit     1      2      3      4

55 minutes     A-1    B-1    A-2    B-2
90 minutes     B-1    A-1    B-2    A-2

A = items 1 to 30, B = items 31 to 60
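
The assignment of applicants and forms to groups can be sketched as follows. The paper states only that assignment was random; the shuffle-and-deal scheme, the seed, and the names DESIGN and assign_groups are assumptions made for illustration.

```python
import random

# Forms taken by each group: (55 minute form, 90 minute form), per Table 2.
DESIGN = {
    1: ("A-1", "B-1"),
    2: ("B-1", "A-1"),
    3: ("A-2", "B-2"),
    4: ("B-2", "A-2"),
}


def assign_groups(applicant_ids, seed=0):
    """Randomly assign applicants to groups 1-4 in near-equal numbers.

    Hypothetical scheme: shuffle the IDs, then deal them out in turn.
    """
    rng = random.Random(seed)
    ids = list(applicant_ids)
    rng.shuffle(ids)
    return {aid: (i % 4) + 1 for i, aid in enumerate(ids)}


# 2940 applicants, as in the sample; the IDs here are placeholders.
assignment = assign_groups(range(2940))
group_sizes = {g: sum(1 for v in assignment.values() if v == g) for g in DESIGN}
```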

RESULTS 

The four groups had almost identical means and standard deviations
on the full 200 item MBE (means ranged from 428.8 to 430.4).

The average mean score, standard deviation, and coefficient alpha on a
set of 30 items taken under the 55 minute time limit were 20.32, 3.96, and
.645, respectively. The corresponding values under the 90 minute time limit
were 21.19, 3.88, and .650. Table 3 shows the differences in these three
statistics between groups under each time limit that were due to the
variation in item order. None of the small observed differences in test
statistics attributable to item sequence even approached statistical or
practical significance.
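
For readers unfamiliar with coefficient alpha, the statistic reported above can be computed from a matrix of scored item responses as sketched below. The formula is the standard one, alpha = k/(k-1) * (1 - sum of item variances / variance of total scores); the function name and the toy data are assumptions for illustration and are not the study's data.

```python
from statistics import pvariance


def coefficient_alpha(responses):
    """Coefficient alpha for a list of examinee rows of item scores.

    responses[i][j] is the score (1 = right, 0 = wrong) of examinee i
    on item j. Population variances are used throughout; the ratio is
    unchanged if sample variances are used consistently instead.
    """
    k = len(responses[0])
    item_vars = [pvariance([row[j] for row in responses]) for j in range(k)]
    total_var = pvariance([sum(row) for row in responses])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


# Toy data (hypothetical): four examinees, three perfectly consistent items.
toy = [[1, 1, 1], [0, 0, 0], [1, 1, 1], [0, 0, 0]]
alpha_toy = coefficient_alpha(toy)
```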


Table 3

DIFFERENCES IN TEST STATISTICS DUE TO ITEM SEQUENCE

Time     Groups      Item    Mean     Standard     Coeff
limit    compared    set     score    deviation    alpha

55       1 vs 3      A       .01      .05          .01
         2 vs 4      B       .12      .04          .01


90       (entries for the 90 minute rows are illegible in the source; surviving values: .06, .28, .26)

The total scores on a form correlated about .70 with the scores on
the regular 200 item MBE that was taken on the following day. Variations
in item sequence did not significantly affect this relationship. For
instance, Groups 1 and 3 had r's of .71 and .69, respectively, under the 55
minute limit. Both groups had an r of .74 under the 90 minute limit.
Variations in sequence also did not affect relationships with scores on the
essay portion of the regular bar examination.
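
One common way to check whether two correlations from independent groups differ is Fisher's z test, sketched below for the r's of .71 and .69. The paper does not say which test was used, and the group sizes are not reported; the value of 735 per group simply assumes the 2940 applicants were split evenly.

```python
import math


def fisher_z_difference(r1, n1, r2, n2):
    """Two-sided Fisher z test for two independent correlations.

    Returns (z statistic, two-sided p value).
    """
    z1, z2 = math.atanh(r1), math.atanh(r2)       # Fisher transformation
    se = math.sqrt(1 / (n1 - 3) + 1 / (n2 - 3))   # SE of the difference
    z = (z1 - z2) / se
    p = math.erfc(abs(z) / math.sqrt(2))          # two-sided normal p value
    return z, p


# ASSUMPTION: 735 examinees per group (2940 / 4); not stated in the paper.
z_stat, p_value = fisher_z_difference(0.71, 735, 0.69, 735)
```

Under these assumed group sizes the statistic falls well below the conventional 1.96 cutoff, which agrees with the statement that sequence did not significantly affect the relationship.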

Under the 55 minute time limit, the means on the 30 items on Form A-1
correlated .98 with their means on Form A-2. The correlation was .99 under
the 90 minute time limit. The corresponding values with the B forms were
.98 and .96. In short, item difficulties as well as total test statistics
were insensitive to variations in item sequence. Correlations among z
transformed item biserials averaged .78 under the 55 minute limit and .73
under the 90 minute limit; however, there was much less variation among the
biserials on a form than there was among that form's item difficulties.
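
The comparison of item means across two versions of a set can be sketched as follows. Responses recorded in presentation order have to be realigned by item number before the item means are correlated. All names and the three-item toy data are hypothetical and serve only to show the mechanics.

```python
import math


def to_item_order(row, form_order):
    """Reorder one examinee's responses from presentation order to item order."""
    return [resp for _, resp in sorted(zip(form_order, row))]


def item_difficulties(responses):
    """Proportion of examinees answering each item correctly."""
    n = len(responses)
    return [sum(row[j] for row in responses) / n for j in range(len(responses[0]))]


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Toy data (hypothetical): version 1 presents items 1, 2, 3; version 2
# presents the same items in the order 3, 2, 1.
version1 = [[1, 1, 0], [1, 0, 0], [1, 1, 1], [1, 0, 0]]
version2_raw = [[0, 1, 1], [0, 0, 1], [1, 1, 1], [0, 0, 0]]
version2 = [to_item_order(row, [3, 2, 1]) for row in version2_raw]

p1 = item_difficulties(version1)
p2 = item_difficulties(version2)
r_difficulty = pearson_r(p1, p2)
```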

DISCUSSION  AND  CONCLUSIONS 

The foregoing findings indicate that variations in the order in which
blocks of MBE items were asked had little or no effect upon test or item
statistics. This was true under the regular time per item as well as under
almost total power conditions. Thus, neither an examinee's score nor the
process of equating tests across administrations would be affected by the
use of multiple forms in which the sequence of items was varied. The use
of such forms therefore appears to be a psychometrically sound and cost
effective method for discouraging cheating in those testing programs that
face the same policy constraints as are encountered on bar examinations.


